OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research

Internal identifier: 000F55 (Main/Exploration); previous: 000F54; next: 000F56

Authors: Saeed Mozaffari [Iran]; Karim Faez [Iran]; Farhad Faradji [Iran]; Majid Ziaratban [Iran]; S. Mohamad Golzan [Iran]

RBID : Hal:inria-00112676

English descriptors

Comparative database; Farsi/Arabic; OCR; isolated numbers and characters; offline

Abstract

This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numbers and characters for use in optical character recognition research. The database is freely available for academic use. So far, no such freely available database exists for the Farsi language. It includes grayscale images of 52,380 characters and 17,740 numerals. Each image was scanned at 300 dpi from Iranian school entrance exam forms collected during the years 2004-2006. The only restriction imposed on the writers was to write each character within a rectangular box. The number of samples in each class is non-uniform, reflecting the real-life distribution of the classes. For comparison purposes, each dataset has been divided into training and test sets.
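
The counts quoted in the abstract, and the idea of a train/test split that keeps a non-uniform class distribution, can be illustrated with a minimal sketch. The abstract does not say how the split was actually made: the per-class (stratified) strategy, the 80/20 ratio, the toy class labels, and the helper name `stratified_split` are all assumptions made here for illustration only.

```python
import random
from collections import defaultdict

# Counts quoted in the abstract.
N_CHARACTERS = 52_380
N_NUMERALS = 17_740
TOTAL_IMAGES = N_CHARACTERS + N_NUMERALS  # 70,120 images overall


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test lists, class by class.

    Hypothetical helper (not from the paper): each class contributes
    roughly the same share of its samples to the test set, so a
    non-uniform class distribution is preserved in both sets.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


if __name__ == "__main__":
    # Toy, made-up labels with a deliberately non-uniform distribution.
    toy_labels = ["alef"] * 50 + ["be"] * 30 + ["0"] * 20
    train_idx, test_idx = stratified_split(toy_labels)
    print(TOTAL_IMAGES, len(train_idx), len(test_idx))
```

Splitting per class rather than over the whole pool is one simple way to keep rare classes represented in both sets; other strategies would be equally compatible with the abstract.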

Url: https://hal.inria.fr/inria-00112676

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research</title>
<author>
<name sortKey="Mozaffari, Saeed" sort="Mozaffari, Saeed" uniqKey="Mozaffari S" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Faradji, Farhad" sort="Faradji, Farhad" uniqKey="Faradji F" first="Farhad" last="Faradji">Farhad Faradji</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Golzan, S Mohamad" sort="Golzan, S Mohamad" uniqKey="Golzan S" first="S. Mohamad" last="Golzan">S. Mohamad Golzan</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:inria-00112676</idno>
<idno type="halId">inria-00112676</idno>
<idno type="halUri">https://hal.inria.fr/inria-00112676</idno>
<idno type="url">https://hal.inria.fr/inria-00112676</idno>
<date when="2006-10-23">2006-10-23</date>
<idno type="wicri:Area/Hal/Corpus">000004</idno>
<idno type="wicri:Area/Hal/Curation">000004</idno>
<idno type="wicri:Area/Hal/Checkpoint">000129</idno>
<idno type="wicri:Area/Main/Merge">000F70</idno>
<idno type="wicri:Area/Main/Curation">000F55</idno>
<idno type="wicri:Area/Main/Exploration">000F55</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research</title>
<author>
<name sortKey="Mozaffari, Saeed" sort="Mozaffari, Saeed" uniqKey="Mozaffari S" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Faradji, Farhad" sort="Faradji, Farhad" uniqKey="Faradji F" first="Farhad" last="Faradji">Farhad Faradji</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Golzan, S Mohamad" sort="Golzan, S Mohamad" uniqKey="Golzan S" first="S. Mohamad" last="Golzan">S. Mohamad Golzan</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-23558" status="INCOMING">
<orgName>Pattern Recognition and Image Processing Laboratory</orgName>
<desc>
<address>
<country key="IR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-307924" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-307924" type="direct">
<org type="institution" xml:id="struct-307924" status="INCOMING">
<orgName>Amirkabir University of Technology, Tehran</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>Iran</country>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>Comparative database</term>
<term>Farsi/Arabic</term>
<term>OCR</term>
<term>isolated numbers and characters</term>
<term>offline</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper presents a new comprehensive database of isolated offline handwritten Farsi/Arabic numbers and characters for use in optical character recognition research. The database is freely available for academic use. So far, no such freely available database exists for the Farsi language. It includes grayscale images of 52,380 characters and 17,740 numerals. Each image was scanned at 300 dpi from Iranian school entrance exam forms collected during the years 2004-2006. The only restriction imposed on the writers was to write each character within a rectangular box. The number of samples in each class is non-uniform, reflecting the real-life distribution of the classes. For comparison purposes, each dataset has been divided into training and test sets.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Iran</li>
</country>
</list>
<tree>
<country name="Iran">
<noRegion>
<name sortKey="Mozaffari, Saeed" sort="Mozaffari, Saeed" uniqKey="Mozaffari S" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
</noRegion>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<name sortKey="Faradji, Farhad" sort="Faradji, Farhad" uniqKey="Faradji F" first="Farhad" last="Faradji">Farhad Faradji</name>
<name sortKey="Golzan, S Mohamad" sort="Golzan, S Mohamad" uniqKey="Golzan S" first="S. Mohamad" last="Golzan">S. Mohamad Golzan</name>
<name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
</country>
</tree>
</affiliations>
</record>
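
As a rough illustration of how the author entries in a record like this one might be read programmatically, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The record uses the `wicri:` and `hal:` prefixes without declaring them, so the sketch wraps the fragment in a root element that binds them to placeholder URIs; those URIs, the `wrapper` element, the shortened excerpt, and the helper name `list_authors` are assumptions, not part of the Wicri or HAL formats.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the record; only the elements used below are kept.
RECORD_EXCERPT = """
<titleStmt>
<title xml:lang="en">A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research</title>
<author>
<name sortKey="Mozaffari, Saeed" first="Saeed" last="Mozaffari">Saeed Mozaffari</name>
<affiliation wicri:level="1">
<country>Iran</country>
</affiliation>
</author>
<author>
<name sortKey="Faez, Karim" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1">
<country>Iran</country>
</affiliation>
</author>
</titleStmt>
"""

# Placeholder namespace URIs (assumptions): they exist only so that the
# parser accepts the otherwise undeclared wicri: and hal: prefixes.
WRAPPER = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:hal="urn:example:hal">{}</wrapper>'
)


def list_authors(fragment):
    """Return (last, first, country) tuples for each <author> element."""
    root = ET.fromstring(WRAPPER.format(fragment))
    authors = []
    for author in root.iter("author"):
        name = author.find("name")
        # Only the <country> directly under <affiliation> carries text.
        country = author.findtext("affiliation/country", default="")
        authors.append((name.get("last"), name.get("first"), country))
    return authors


if __name__ == "__main__":
    for last, first, country in list_authors(RECORD_EXCERPT):
        print(f"{last}, {first} [{country}]")
```

The path `affiliation/country` matches only the direct child of `<affiliation>`, which avoids the empty `<country key="…">` elements nested deeper in the full record.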

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F55 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000F55 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Hal:inria-00112676
   |texte=   A Comprehensive Isolated Farsi/Arabic Character Database for Handwritten OCR Research
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024